Interactive Debugging and Performance Analysis of Massively Parallel Applications
Authors
Abstract
In the field of high performance computing, massively parallel processing systems (MPPs) are becoming increasingly important. A rising number of complex applications are being parallelized for execution on these machines. Still, a significant portion of the time needed for parallelization is spent on debugging and performance tuning. A main reason for this is the absence of adequate tools supporting this phase of program development. In this article, we present a novel tool environment consisting of a parallel debugger (DETOP), a performance analyzer (PATOP), and a common monitoring system for PowerPC-based parallel computers. The environment specifically addresses scalability, usability for dynamic, multithreaded programming models, minimal intrusion, debugging and tuning methodology, and comfortable user interfaces. We derive requirements for tools monitoring the runtime behavior of parallel programs, present the concepts used to meet these requirements in our tool environment, and describe its implementation and usage. DETOP is based on the event-action paradigm and supports both data parallel codes and programs based on functional decomposition. Special features are provided for applications that dynamically create new threads or consist of multiple executables. PATOP supports a systematic search for performance bottlenecks in massively parallel applications using the concept of attributed measurements and distributed evaluation. Both tools are based on a common, distributed on-line monitoring system providing the necessary runtime information.
Similar resources
A Tool for On-line Visualization and Interactive Steering of Parallel HPC Applications
Tools for parallel systems today range from specification over debugging to performance analysis and more. Typically, they help the programmers of parallel algorithms from the early development stages to a certain level of program optimization. However, in HPC (High Performance Computing) today the end-user of massively parallel CFD (Computational Fluid Dynamics)-programs has little or no suppo...
Grid-based Workflow Management for Automatic Performance Analysis of Massively Parallel Applications
Many Grid infrastructures have begun to offer services to end-users during the past several years with an increasing number of complex scientific applications and software tools that require seamless access to different Grid resources via Grid middleware during one workflow. End-users of the rather HPC-driven DEISA Grid infrastructure take not only advantage of Grid workflow management capabili...
Annai Scalable Run-Time Support for Interactive Debugging and Performance Analysis of Large-Scale Parallel Programs
The Annai tool environment helps exploit distributed-memory parallel computers with High Performance Fortran and/or explicit communication, using MPI as a portable machine interface. Integration within a unified environment allows the component parallelization and compilation support, debugging and performance tools to synergetically use common facilities. Additionally, massive quantities of pa...
The Illinois Concert System: Programming Support for Irregular Parallel Applications
Irregular applications are critical to supporting grand challenge applications on massively parallel machines and extending the utility of those machines beyond the scientific computing domain. The dominant parallel programming models, data parallel and explicit message passing, provide little support for programming irregular applications. We articulate a set of requirements for supporting irreg...
Flexible performance visualization of parallel and distributed applications
Performance debugging of parallel and distributed applications can benefit from behavioral visualization tools helping to capture the dynamics of the executions of applications. The Pajé generic tool presented in this article provides interactive and scalable behavioral visualizations; because of its genericity, it can be used unchanged in a large variety of contexts. © 2002 Elsevier Science B....
Journal title:
- Parallel Computing
Volume 22, Issue
Pages -
Publication date 1996